{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d791bc8a",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/retrievers/vertex_ai_search_retriever.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9075b09-3916-49b0-b565-b0a0091b5872",
   "metadata": {},
   "source": [
    "# Vertex AI Search Retriever\n",
    "\n",
    "This notebook walks you through how to setup a Retriever that can fetch from Vertex AI search datastore"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1b69157",
   "metadata": {},
   "source": [
    "### Pre-requirements\n",
    "- Set up a Google Cloud project\n",
    "- Set up a Vertex AI Search datastore\n",
    "- Enable Vertex AI API\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d42fd6c3",
   "metadata": {},
   "source": [
    "### Install library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5ac8d10",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-retrievers-vertexai-search"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a313ae1",
   "metadata": {},
   "source": [
    "### Restart current runtime\n",
    "\n",
    "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d621a37",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Colab only\n",
    "# Automatically restart kernel after installs so that your environment can access the new packages\n",
    "import IPython\n",
    "\n",
    "app = IPython.Application.instance()\n",
    "app.kernel.do_shutdown(True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37391183",
   "metadata": {},
   "source": [
    "### Authenticate your notebook environment (Colab only)\n",
    "\n",
    "If you are running this notebook on Google Colab, you will need to authenticate your environment. To do this, run the new cell below. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06647551",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Colab only\n",
    "import sys\n",
    "\n",
    "if \"google.colab\" in sys.modules:\n",
    "    from google.colab import auth\n",
    "\n",
    "    auth.authenticate_user()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d774e9c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# If you're using JupyterLab instance, uncomment and run the below code.\n",
    "#!gcloud auth login"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1fb3b393-1171-4481-b75f-b7626ef7b0b2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.retrievers.vertexai_search import VertexAISearchRetriever\n",
    "\n",
    "# Please note it's underscore '_' in vertexai_search"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bda2c5e0",
   "metadata": {},
   "source": [
    "### Set Google Cloud project information and initialize Vertex AI SDK\n",
    "\n",
    "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
    "\n",
    "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "145f32a6",
   "metadata": {},
   "outputs": [],
   "source": [
    "PROJECT_ID = \"{your project id}\"  # @param {type:\"string\"}\n",
    "LOCATION = \"us-central1\"  # @param {type:\"string\"}\n",
    "import vertexai\n",
    "\n",
    "vertexai.init(project=PROJECT_ID, location=LOCATION)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d638b07e",
   "metadata": {},
   "source": [
    "### Test Structured datastore"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "81a785c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "DATA_STORE_ID = \"{your id}\"  # @param {type:\"string\"}\n",
    "LOCATION_ID = \"global\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e47e78b",
   "metadata": {},
   "outputs": [],
   "source": [
    "struct_retriever = VertexAISearchRetriever(\n",
    "    project_id=PROJECT_ID,\n",
    "    data_store_id=DATA_STORE_ID,\n",
    "    location_id=LOCATION_ID,\n",
    "    engine_data_type=1,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "80d0361b",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"harry potter\"\n",
    "retrieved_results = struct_retriever.retrieve(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bbfc0fe3-7c64-4d5d-8190-f80e31d35b4c",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(retrieved_results[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f9ac98d",
   "metadata": {},
   "source": [
    "### Test Unstructured datastore"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de6d219f",
   "metadata": {},
   "outputs": [],
   "source": [
    "DATA_STORE_ID = \"{your id}\"\n",
    "LOCATION_ID = \"global\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "22b79af5",
   "metadata": {},
   "outputs": [],
   "source": [
    "unstruct_retriever = VertexAISearchRetriever(\n",
    "    project_id=PROJECT_ID,\n",
    "    data_store_id=DATA_STORE_ID,\n",
    "    location_id=LOCATION_ID,\n",
    "    engine_data_type=0,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29fd9553",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"alphabet 2018 earning\"\n",
    "retrieved_results2 = unstruct_retriever.retrieve(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb5dc435",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(retrieved_results2[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df30f7fc",
   "metadata": {},
   "source": [
    "### Test Website datastore"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "831f4dc4",
   "metadata": {},
   "outputs": [],
   "source": [
    "DATA_STORE_ID = \"{your id}\"\n",
    "LOCATION_ID = \"global\"\n",
    "website_retriever = VertexAISearchRetriever(\n",
    "    project_id=PROJECT_ID,\n",
    "    data_store_id=DATA_STORE_ID,\n",
    "    location_id=LOCATION_ID,\n",
    "    engine_data_type=2,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1a0b7c95",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"what's diamaxol\"\n",
    "retrieved_results3 = website_retriever.retrieve(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "566cc68b",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(retrieved_results3[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6d4f7dc5-bfcd-4f04-8475-e2a6125d70bd",
   "metadata": {},
   "source": [
    "## Use in Query Engine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "416987b4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# import modules needed\n",
    "from llama_index.core import Settings\n",
    "from llama_index.llms.vertex import Vertex\n",
    "from llama_index.embeddings.vertex import VertexTextEmbedding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b1e8ec8",
   "metadata": {},
   "outputs": [],
   "source": [
    "vertex_gemini = Vertex(\n",
    "    model=\"gemini-1.5-pro\",\n",
    "    temperature=0,\n",
    "    context_window=100000,\n",
    "    additional_kwargs={},\n",
    ")\n",
    "# setup the index/query llm\n",
    "Settings.llm = vertex_gemini"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4c879f7-7554-4822-94df-cfe008a03a68",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.query_engine import RetrieverQueryEngine\n",
    "\n",
    "query_engine = RetrieverQueryEngine.from_args(struct_retriever)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "471102ba-aed8-435d-9ddc-de51644aa886",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = query_engine.query(\"Tell me about harry potter\")\n",
    "print(str(response))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "you-llamaindex",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
